MAPTIME030 / geodata in R

GreenStat - Peter van Horssen
'02 maart, 2018'

Introduction - Who am I

Peter van Horssen www.greenstat.nl

background in physical geography & ecology
experience in data analysis/statistics/mapping in ecological studies

  • animal tagging studies
  • impact assessment studies
  • data visualisation

analysis : statistics, GIS, graphs, maps

Introduction - What is the Plan ?

  • Intro

    • geo-data in R
  • Basics:

    • convert non-spatial tot spatial data in R
    • reading and writing

do stuff yourself

…. coffee break ….

  • Analysis:
    • some examples

do more stuff yourself

geo-data in R

Why GIS in R ?
or maybe start with : why R?

  • R is a usefull big box with all tools needed for data-analysis
  • R is not better than other tools
  • R does not make your data-analysis problem easier/simple !

  • creation of reproducible workflow

  • scripting and documentation of workflow

  • Recently a boost in the available 'spatial' tools (based on 'simple features') All spatial analysis functionality (and more!) is available in R, plotting and layout for maps is at a basic level.

  • this presentation provides 'pointers' only

geo-data in R

what is geo-data?

  • not all data is geodata
  • formats (xlsx,csv, shp,gml,kml,…..)
  • meta-data (map projection, units)

geo-data in R

'simple features' is a data model with basic features for all spatial data:

  • ISO standard for a common storage and access model of mostly two-dimensional geometries (point, line, polygon, multi-point, multi-line, etc.) used by geographic information systems. link

This standard is 'under the hood' used in nearly all GIS packages (QGis, Esri, PostGIS, ..)

it also exports all OGC operations:
st_area, st_buffer, st_length, st_transform,….

geo-data in R

  • why do we need this ?
  • who needs this ?

r&d for small to midsize (in terms of data) analysis

very usefull for data-exploration (spatial, temporal, attributes)

keep the workflow on one platform

analysis in R, fancy (web)mapping somewhere else

geo-data in R

Assume: User with basic knowledge of R

data.frame, x[] , str()

User with workable knowledge of spatial analysis and map projections

Software: R-core
packages :

  • sf, rgdal, raster
  • mapview, tmap
  • tidyverse

Please download this before the meetup, R lives @ cran : https://cran.r-project.org/bin/windows/base/

Package are installed when R is running, choose 'Package|Install Packages' in the topbar, choose a cloudsource and select package name

geo-data in R

sf : package to handle spatial vector data efficiently (https://r-spatial.github.io/sf/) tidyverse: alle you need data exploration tools (https://www.tidyverse.org/packages/)

  • ggplot : graphs in every form
  • dplyr : tools for data manipulation
  • much more but we skip that for now

Non spatial > spatial data

test set simple example: points with coordinates

head(df)
  ID      var2 var1      dates        X        Y
1  1 0.3474181    D 2016-05-30 4.473945 52.03298
2  2 0.1076117    D 2016-08-20 5.034427 52.78106
3  3 0.4108196    D 2016-07-30 7.115377 52.75334
4  4 0.2413287    B 2016-06-25 5.033752 53.19410
5  5 0.8936213    A 2016-03-26 4.207123 52.05226
6  6 0.2559923    D 2016-01-13 6.602886 51.06076
# script voor test df
n=10^3

df <- data.frame(
  ID=c(1:n),
  var2=runif(n),
  var1=sample(LETTERS[1:4], n, replace=TRUE),
  dates=sample(seq(as.Date('2016/01/01'), as.Date('2017/01/01'), by="day"), n, replace=TRUE),
  X  = runif(n, min= 3.36,max= 7.23), # why this min/max?
  Y  = runif(n, min=50.72,max=53.55)
  )

Non spatial > spatial data

library(sf)

df.sf <- st_as_sf(df,coords=c('X','Y'))

df.sf
Simple feature collection with 1000 features and 4 fields
geometry type:  POINT
dimension:      XY
bbox:           xmin: 3.36023 ymin: 50.72252 xmax: 7.221368 ymax: 53.54322
epsg (SRID):    NA
proj4string:    NA
First 10 features:
   ID      var2 var1      dates                  geometry
1   1 0.1911551    B 2016-03-19 POINT (5.944756 50.74802)
2   2 0.4594477    B 2016-06-02 POINT (5.355002 51.36075)
3   3 0.6065719    A 2016-08-22 POINT (4.910865 52.96492)
4   4 0.5212724    C 2016-06-21 POINT (6.172521 52.09474)
5   5 0.0396683    A 2016-06-16 POINT (5.321477 52.37966)
6   6 0.2919543    C 2016-03-13 POINT (4.474442 52.23473)
7   7 0.3235874    C 2016-11-20 POINT (7.099024 53.25873)
8   8 0.9087451    A 2016-03-17 POINT (5.424322 51.55503)
9   9 0.4612557    A 2016-08-02 POINT (4.036561 50.73775)
10 10 0.7023028    A 2016-06-14 POINT (4.471334 52.10025)
#st_as_sf(df,coords=c('X','Y','var2'), dim="XYZ")
#head(df.sf,n=2)
#head(df,n=2)

Non spatial > spatial data

mapprojection - metadata

in sf map projections through CRS (Coordinate Reference System) at http://spatialreference.org/

Non spatial > spatial data

df.sf <- st_as_sf(df,coords=c('X','Y'), crs=4326)

df.sf
Simple feature collection with 1000 features and 4 fields
geometry type:  POINT
dimension:      XY
bbox:           xmin: 3.36023 ymin: 50.72252 xmax: 7.221368 ymax: 53.54322
epsg (SRID):    4326
proj4string:    +proj=longlat +datum=WGS84 +no_defs
First 10 features:
   ID      var2 var1      dates                  geometry
1   1 0.1911551    B 2016-03-19 POINT (5.944756 50.74802)
2   2 0.4594477    B 2016-06-02 POINT (5.355002 51.36075)
3   3 0.6065719    A 2016-08-22 POINT (4.910865 52.96492)
4   4 0.5212724    C 2016-06-21 POINT (6.172521 52.09474)
5   5 0.0396683    A 2016-06-16 POINT (5.321477 52.37966)
6   6 0.2919543    C 2016-03-13 POINT (4.474442 52.23473)
7   7 0.3235874    C 2016-11-20 POINT (7.099024 53.25873)
8   8 0.9087451    A 2016-03-17 POINT (5.424322 51.55503)
9   9 0.4612557    A 2016-08-02 POINT (4.036561 50.73775)
10 10 0.7023028    A 2016-06-14 POINT (4.471334 52.10025)

Non spatial > spatial data

str(df.sf) # str : shows structure of object
Classes 'sf' and 'data.frame':  1000 obs. of  5 variables:
 $ ID      : int  1 2 3 4 5 6 7 8 9 10 ...
 $ var2    : num  0.1912 0.4594 0.6066 0.5213 0.0397 ...
 $ var1    : Factor w/ 4 levels "A","B","C","D": 2 2 1 3 1 3 3 1 1 1 ...
 $ dates   : Date, format: "2016-03-19" "2016-06-02" ...
 $ geometry:sfc_POINT of length 1000; first list element: Classes 'XY', 'POINT', 'sfg'  num [1:2] 5.94 50.75
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA
  ..- attr(*, "names")= chr  "ID" "var2" "var1" "dates"
str(df)
'data.frame':   1000 obs. of  6 variables:
 $ ID   : int  1 2 3 4 5 6 7 8 9 10 ...
 $ var2 : num  0.1912 0.4594 0.6066 0.5213 0.0397 ...
 $ var1 : Factor w/ 4 levels "A","B","C","D": 2 2 1 3 1 3 3 1 1 1 ...
 $ dates: Date, format: "2016-03-19" "2016-06-02" ...
 $ X    : num  5.94 5.36 4.91 6.17 5.32 ...
 $ Y    : num  50.7 51.4 53 52.1 52.4 ...

objects keep 'dataframe' class

Non spatial > spatial data

df.sf[1:3,] # first three rows of dataframe
Simple feature collection with 3 features and 4 fields
geometry type:  POINT
dimension:      XY
bbox:           xmin: 4.910865 ymin: 50.74802 xmax: 5.944756 ymax: 52.96492
epsg (SRID):    4326
proj4string:    +proj=longlat +datum=WGS84 +no_defs
  ID      var2 var1      dates                  geometry
1  1 0.1911551    B 2016-03-19 POINT (5.944756 50.74802)
2  2 0.4594477    B 2016-06-02 POINT (5.355002 51.36075)
3  3 0.6065719    A 2016-08-22 POINT (4.910865 52.96492)
#df.sf[,3]
df.sf[1:3,2:3] # first three rows and column 2 and 3 only
Simple feature collection with 3 features and 2 fields
geometry type:  POINT
dimension:      XY
bbox:           xmin: 4.910865 ymin: 50.74802 xmax: 5.944756 ymax: 52.96492
epsg (SRID):    4326
proj4string:    +proj=longlat +datum=WGS84 +no_defs
       var2 var1                  geometry
1 0.1911551    B POINT (5.944756 50.74802)
2 0.4594477    B POINT (5.355002 51.36075)
3 0.6065719    A POINT (4.910865 52.96492)

Non spatial > spatial data

library(mapview) # R wrapper for leaflet .....


#mapview(df.sf)
mapview(df.sf,zcol="var1")


#library(tmap)
#tmap_mode("view")
#tm_shape(df.sf) + tm_dots(col="black", size=.1)

geo-data in R

do stuff yourself ….

…coffee break…

import/export external GIS-formats

maps downloaded from web CBS : buurt_2017.zip

  • kaart met wijken, buurten en gemeenten in NL
  • uitpakken: shape files
list.files("../dataUtrecht/buurt_2017")
 [1] "buurt_2017.cpg"     "buurt_2017.dbf"     "buurt_2017.prj"    
 [4] "buurt_2017.shp"     "buurt_2017.shp.xml" "buurt_2017.shx"    
 [7] "gem_2017.cpg"       "gem_2017.dbf"       "gem_2017.prj"      
[10] "gem_2017.shp"       "gem_2017.shp.xml"   "gem_2017.shx"      
[13] "wijk_2017.cpg"      "wijk_2017.dbf"      "wijk_2017.prj"     
[16] "wijk_2017.shp"      "wijk_2017.shp.xml"  "wijk_2017.shx"     

Utrecht : bomenkaart.zip

  • kaart met bomen in utrecht
  • uitpakken : shapefile met bomen in Utrecht
list.files("../dataUtrecht/bomenkaart")
[1] "Bomen_GISIB_ArcGISonline.dbf" "Bomen_GISIB_ArcGISonline.prj"
[3] "Bomen_GISIB_ArcGISonline.sbn" "Bomen_GISIB_ArcGISonline.sbx"
[5] "Bomen_GISIB_ArcGISonline.shp" "Bomen_GISIB_ArcGISonline.shx"

geo-data in R

library(sf)
library(tidyverse)
buurt.sf <- st_read("../dataUtrecht/buurt_2017/buurt_2017.shp")
Reading layer `buurt_2017' from data source `L:\GreenStat\projecten\2017-X03 maptime 030 R GIS\dataUtrecht\buurt_2017\buurt_2017.shp' using driver `ESRI Shapefile'
Simple feature collection with 13308 features and 39 fields
geometry type:  MULTIPOLYGON
dimension:      XY
bbox:           xmin: 10425.16 ymin: 306846.2 xmax: 278026.1 ymax: 621876.3
epsg (SRID):    NA
proj4string:    +proj=sterea +lat_0=52.15616055555555 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
buurt.sf %>% head(n=3)
Simple feature collection with 3 features and 39 fields
geometry type:  MULTIPOLYGON
dimension:      XY
bbox:           xmin: 251260.5 ymin: 592402 xmax: 254809.1 ymax: 594573.5
epsg (SRID):    NA
proj4string:    +proj=sterea +lat_0=52.15616055555555 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
     BU_CODE            BU_NAAM  WK_CODE GM_CODE    GM_NAAM IND_WBI WATER
1 BU00030000 Appingedam-Centrum WK000300  GM0003 Appingedam       1   NEE
2 BU00030001    Appingedam-West WK000300  GM0003 Appingedam       1   NEE
3 BU00030002    Appingedam-Oost WK000300  GM0003 Appingedam       1   NEE
  POSTCODE DEK_PERC  OAD STED AANT_INW AANT_MAN AANT_VROUW P_00_14_JR
1     9901        1 1190    3     2335     1090       1245         10
2     9903        5  894    4     3080     1535       1545         17
3     9902        1 1112    3     5955     2865       3090         16
  P_15_24_JR P_25_44_JR P_45_64_JR P_65_EO_JR P_ONGEHUWD P_GEHUWD
1          9         21         30         30         40       36
2         11         20         33         19         43       47
3         11         22         27         25         43       41
  P_GESCHEID P_VERWEDUW BEV_DICHTH AANTAL_HH P_EENP_HH P_HH_Z_K P_HH_M_K
1         11         12       2774      1310        54       28       18
2          7          4       1950      1335        27       37       36
3          9          8       2094      2735        35       31       34
  GEM_HH_GR P_WEST_AL P_N_W_AL P_MAROKKO P_ANT_ARU P_SURINAM P_TURKIJE
1       1.7         6        4         0         1         0         1
2       2.3         6        3         0         1         0         1
3       2.1         9        9         1         1         1         4
  P_OVER_NW OPP_TOT OPP_LAND OPP_WATER                       geometry
1         2      90       84         5 MULTIPOLYGON (((253641.5 59...
2         1     163      158         5 MULTIPOLYGON (((251260.5 59...
3         3     295      284        11 MULTIPOLYGON (((254580.7 59...

geo-data in R

library(sf)
library(tidyverse)

buurt.sf %>% str() # show structure of object
Classes 'sf' and 'data.frame':  13308 obs. of  40 variables:
 $ BU_CODE   : Factor w/ 13308 levels "BU00030000","BU00030001",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ BU_NAAM   : Factor w/ 12251 levels "'n Oaln Diek",..: 372 374 373 9715 10782 10735 786 9585 12185 11064 ...
 $ WK_CODE   : Factor w/ 3159 levels "WK000300","WK000500",..: 1 1 1 1 1 1 2 2 2 2 ...
 $ GM_CODE   : Factor w/ 389 levels "GM0003","GM0005",..: 1 1 1 1 1 1 2 2 2 2 ...
 $ GM_NAAM   : Factor w/ 388 levels "'s-Gravenhage",..: 20 20 20 20 20 20 28 28 28 28 ...
 $ IND_WBI   : num  1 1 1 1 1 1 1 1 1 1 ...
 $ WATER     : Factor w/ 3 levels "B","JA","NEE": 3 3 3 3 3 3 3 3 3 3 ...
 $ POSTCODE  : Factor w/ 3907 levels "-99999999","1011",..: 3830 3832 3831 3832 3831 3830 3782 3782 3784 3784 ...
 $ DEK_PERC  : num  1 5 1 4 1 1 1 1 1 1 ...
 $ OAD       : num  1190 894 1112 351 74 ...
 $ STED      : num  3 4 3 5 5 5 4 5 5 5 ...
 $ AANT_INW  : num  2335 3080 5955 320 100 ...
 $ AANT_MAN  : num  1090 1535 2865 165 50 ...
 $ AANT_VROUW: num  1245 1545 3090 150 50 ...
 $ P_00_14_JR: num  10 17 16 21 17 24 16 21 14 17 ...
 $ P_15_24_JR: num  9 11 11 11 8 11 12 12 11 10 ...
 $ P_25_44_JR: num  21 20 22 23 14 15 21 27 17 16 ...
 $ P_45_64_JR: num  30 33 27 35 45 33 29 26 38 39 ...
 $ P_65_EO_JR: num  30 19 25 10 17 17 22 13 21 17 ...
 $ P_ONGEHUWD: num  40 43 43 50 40 46 43 48 43 41 ...
 $ P_GEHUWD  : num  36 47 41 44 55 47 45 47 45 51 ...
 $ P_GESCHEID: num  11 7 9 3 3 6 6 4 8 7 ...
 $ P_VERWEDUW: num  12 4 8 2 2 1 6 1 4 2 ...
 $ BEV_DICHTH: num  2774 1950 2094 59 18 ...
 $ AANTAL_HH : num  1310 1335 2735 115 40 ...
 $ P_EENP_HH : num  54 27 35 18 15 21 30 18 34 19 ...
 $ P_HH_Z_K  : num  28 37 31 32 48 37 34 38 37 46 ...
 $ P_HH_M_K  : num  18 36 34 50 38 42 36 45 29 35 ...
 $ GEM_HH_GR : num  1.7 2.3 2.1 2.7 2.5 2.7 2.3 2.7 2.2 2.5 ...
 $ P_WEST_AL : num  6 6 9 6 2 8 4 6 5 5 ...
 $ P_N_W_AL  : num  4 3 9 0 1 1 3 2 4 4 ...
 $ P_MAROKKO : num  0e+00 0e+00 1e+00 -1e+08 -1e+08 ...
 $ P_ANT_ARU : num  1e+00 1e+00 1e+00 -1e+08 -1e+08 ...
 $ P_SURINAM : num  0e+00 0e+00 1e+00 -1e+08 -1e+08 ...
 $ P_TURKIJE : num  1e+00 1e+00 4e+00 -1e+08 -1e+08 ...
 $ P_OVER_NW : num  2e+00 1e+00 3e+00 -1e+08 -1e+08 ...
 $ OPP_TOT   : num  90 163 295 559 582 769 313 2190 74 699 ...
 $ OPP_LAND  : num  84 158 284 540 554 ...
 $ OPP_WATER : num  5 5 11 18 28 13 5 14 3 6 ...
 $ geometry  :sfc_MULTIPOLYGON of length 13308; first list element: List of 1
  ..$ :List of 1
  .. ..$ : num [1:72, 1:2] 253642 253617 253599 253593 253602 ...
  ..- attr(*, "class")= chr  "XY" "MULTIPOLYGON" "sfg"
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
  ..- attr(*, "names")= chr  "BU_CODE" "BU_NAAM" "WK_CODE" "GM_CODE" ...

geo-data in R

library(sf)
library(tidyverse)

u.buurt.sf <-
  buurt.sf %>% filter(GM_NAAM=='Utrecht') %>% select(GM_NAAM,BU_NAAM,AANT_INW)

u.buurt.sf %>% head(n=3)
Simple feature collection with 3 features and 3 fields
geometry type:  MULTIPOLYGON
dimension:      XY
bbox:           xmin: 133872 ymin: 454563.4 xmax: 135198.4 ymax: 456063.6
epsg (SRID):    NA
proj4string:    +proj=sterea +lat_0=52.15616055555555 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
  GM_NAAM                BU_NAAM AANT_INW                       geometry
1 Utrecht Welgelegen, Den Hommel     1395 MULTIPOLYGON (((135198.4 45...
2 Utrecht              Oog in Al     4280 MULTIPOLYGON (((134877.6 45...
3 Utrecht        Halve Maan-Zuid     1435 MULTIPOLYGON (((134161.3 45...

geo-data in R

library(sf)
library(tidyverse)
library(mapview)

u.buurt.sf %>% mapview()
#u.buurt.sf %>% mapview(zcol="BU_NAAM")
u.buurt.sf %>% mapview(fill=NA)

geo-data in R

library(sf)
library(tidyverse)
bomen.sf <- st_read("../dataUtrecht/bomenkaart/Bomen_GISIB_ArcGISonline.shp")
Reading layer `Bomen_GISIB_ArcGISonline' from data source `L:\GreenStat\projecten\2017-X03 maptime 030 R GIS\dataUtrecht\bomenkaart\Bomen_GISIB_ArcGISonline.shp' using driver `ESRI Shapefile'
Simple feature collection with 168736 features and 10 fields
geometry type:  POINT
dimension:      XY
bbox:           xmin: 126735.1 ymin: 448930 xmax: 141827.3 ymax: 461162.1
epsg (SRID):    NA
proj4string:    +proj=sterea +lat_0=52.15616055555555 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
bomen.sf %>% head(n=3)
Simple feature collection with 3 features and 10 fields
geometry type:  POINT
dimension:      XY
bbox:           xmin: 134185.4 ymin: 457536.2 xmax: 134346.6 ymax: 458042.9
epsg (SRID):    NA
proj4string:    +proj=sterea +lat_0=52.15616055555555 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
           Naam_NL         Naam_Wet Plantjaar          Eigenaar
1             Kers Prunus 'Umineko'      2017      Gemeentelijk
2 Japanse sierkers Prunus serrulata      2017      Gemeentelijk
3       Schietwilg       Salix alba        NA Niet gemeentelijk
                Buurt      Wijk  Boomnr Boomnr_Oud X_coordina Y_coordina
1           Elinkwijk Noordwest 2938159       <NA>     134185     457957
2   Edisonstraat e.o. Noordwest 2938160       <NA>     134292     458042
3 Schaepmanplein e.o. Noordwest 2938161       <NA>     134346     457536
                   geometry
1 POINT (134185.4 457957.2)
2 POINT (134292.8 458042.9)
3 POINT (134346.6 457536.2)

Non spatial > data exploration `tidy-style`

www.tidyverse.org/packages/

use of 'pipes' : '%>%'

select : select column
filter : filter rows
mutate : add column (and mutate value)
summarize : aggregate

library(tidyverse)
df <- df %>% mutate(maand=format.Date(dates, "%m"))
df %>% head()
  ID      var2 var1      dates        X        Y maand
1  1 0.1911551    B 2016-03-19 5.944756 50.74802    03
2  2 0.4594477    B 2016-06-02 5.355002 51.36075    06
3  3 0.6065719    A 2016-08-22 4.910865 52.96492    08
4  4 0.5212724    C 2016-06-21 6.172521 52.09474    06
5  5 0.0396683    A 2016-06-16 5.321477 52.37966    06
6  6 0.2919543    C 2016-03-13 4.474442 52.23473    03
df %>% filter(var1=="B" ) %>% head()
  ID      var2 var1      dates        X        Y maand
1  1 0.1911551    B 2016-03-19 5.944756 50.74802    03
2  2 0.4594477    B 2016-06-02 5.355002 51.36075    06
3 12 0.1905957    B 2016-07-11 6.935469 51.74871    07
4 14 0.3848420    B 2016-01-02 5.704743 52.39669    01
5 15 0.7399423    B 2016-10-04 6.705704 51.05816    10
6 18 0.3100369    B 2016-08-01 5.364002 50.79384    08

Non spatial > data exploration `tidy-style`

df %>% 
ggplot(aes(dates,ID)) +
geom_point() + 
theme(text = element_text(size = 25))

plot of chunk unnamed-chunk-16

df %>%
ggplot(aes(maand,var2, group=maand)) +
geom_boxplot()+ 
theme(text = element_text(size = 25))

plot of chunk unnamed-chunk-17

Non spatial > data exploration `tidy-style`

df %>% ggplot(aes(var1,var2)) +
geom_boxplot() +
facet_wrap(~maand) +            # conditional plot
theme(text = element_text(size = 25))

plot of chunk unnamed-chunk-18

df %>% ggplot(aes(maand,var2)) +
geom_boxplot() +
facet_wrap(~var1)+ 
theme(text = element_text(size = 25))

plot of chunk unnamed-chunk-19

spatial analysis - view selection

df.sf[1:3,]  
Simple feature collection with 3 features and 4 fields
geometry type:  POINT
dimension:      XY
bbox:           xmin: 4.910865 ymin: 50.74802 xmax: 5.944756 ymax: 52.96492
epsg (SRID):    4326
proj4string:    +proj=longlat +datum=WGS84 +no_defs
  ID      var2 var1      dates                  geometry
1  1 0.1911551    B 2016-03-19 POINT (5.944756 50.74802)
2  2 0.4594477    B 2016-06-02 POINT (5.355002 51.36075)
3  3 0.6065719    A 2016-08-22 POINT (4.910865 52.96492)
# selection returns sf object ..

#df.sf[,3]
#df.sf[1:3,2:3]
mapview(nl01.sf) + df.sf[1:10,"var1"] 

# 

spatial analysis - view selection

library(mapview)
library(sf)
library(tidyverse)

# select by spatial object .....!
df.sf[nl01.sf,] 

mapview(nl01.sf) + df.sf[nl01.sf,]

#what if ?
# nl01.sf[df.sf,]

#nl01.sf[nl01.sf$NAME_1 =="Noord-Brabant",]
nbr.sf <-
  nl01.sf %>% filter(NAME_1 =="Noord-Brabant")

df.sf[nbr.sf,] 

mapview( nbr.sf)+ df.sf[nbr.sf,]

# neat !

spatial analysis - buffer

nbr.sf %>% st_crs()
#Coordinate Reference System:
#  EPSG: 4326 
#  proj4string: "+proj=longlat +datum=WGS84 +no_defs"
nbr.sf %>%
  st_transform(28992) %>%  
                  # what if we don't do this ?
  st_buffer(dist=10000) %>%
  mapview() + nbr.sf # what happens here ....?


# st_area ?
# st_length?

spatial analysis - spatial join

df.sf[nl01.sf,] %>%     
      st_join (nl01.sf %>% select(NAME_1))

#Simple feature collection with 475 features and 6 fields
#Geometry type:  POINT
#dimension:      XY
#bbox:           xmin: 3.52603 ymin: 50.76231 xmax: 7.163211 ymax: 53.48051
#epsg (SRID):    4326
#proj4string:    +proj=longlat +datum=WGS84 +no_defs
#First 10 features:
#   ID      var2 var1      dates  m        NAME_1                       geometry
#2   2 0.4210747    B 2016-02-14 02 Noord-Brabant POINT (4.82705890548415 51....
#3   3 0.3327989    D 2016-04-16 04       Utrecht POINT (5.44536982017802 52....
df.sf[nl01.sf,] %>%                             #select points IN nl01.sf
      st_join (nl01.sf %>% select(NAME_1)) %>%    # spatial join & select column 'NAME_1'
      mapview(zcol="NAME_1")                    # display the result

spatial analysis - spatial join & ggplot

library(sf)
library(tidyverse)

df.sf[nl01.sf,] %>% 
  st_join (nl01.sf %>% select(NAME_1)) %>%
  ggplot(aes(NAME_1,var2)) + geom_boxplot() +
  theme(axis.text.x = element_text(angle = 60, hjust = 1))+ 
  theme(text = element_text(size = 15))

spatial analysis - puzzle !

aggregate points on a regular grid and count
plot map with # points per grid

library(sf)
library(tidyverse)
library(mapview)

#? st_make_grid

fishnet.sf <-
nl01.sf %>%
  st_transform(28992) %>%
  st_make_grid(cellsize=25000) %>%
  st_sf() %>%
  mutate(fid=row_number()) 

#mapview(fishnet.sf,zcol="fid")
point_aggregate.sf <-
df.sf %>%
  st_transform(28992) %>%
  st_join(fishnet.sf ) %>%
  group_by(fid) %>%             #
  summarize(n=n()) %>%          # count number 
                                # of points
  st_centroid()                 # huh? ?st_centroid

#plot maps on top of each other 
mapview(nl01.sf, fill=NA) +
mapview(fishnet.sf, fill=NA) +
  mapview(point_aggregate.sf, cex="n") 

spatial analysis

do stuff yourself